Abstract - An Information Fusion (IF) based Decision Support Tool (DST)
is presented to aid the identification of a target, from a large set of
candidates, carrying out a pattern of activity which could comprise a
wide variety of possible sub-activities and chronologies of events. The
overall activity can only be defined in terms of its impact and, in some
cases, detectable signatures of sub-activities. Hidden Markov Models
(HMMs) and time series anomaly detection methods process multi-modal
sensor data, the outputs of which are then integrated by a novel,
efficient Bayesian IF algorithm to provide a probability that each
candidate under observation is carrying out the target activity. The DST
has been developed to prototype status by implementing this framework
using commercial off-the-shelf (COTS) software. The DST allows the
decision maker to rapidly access current and historical situational
awareness pictures quantifying the progress of the overall search. A
range of geospatial visualization and data interrogation features
available to the decision maker are described and their performance is
qualitatively evaluated. Finally, planned future developments are
outlined.
1  Introduction 

The  task  of  identifying  an  unknown  pattern  of  activity 
which  may  be  being  carried  out  by  one  or  more  individuals 
in  a  large  set  of  candidates  is  a  complex  and  challenging 
problem  facing  the  military  today.  For  example,  candidates 
may  refer  to  people,  buildings  or  vehicles.  In  many  cases 
only  the  output  or  impact  of  a  target  activity  is  known  in 
advance  meaning  that  there  are  no  prescribed  sub-activities 
which  must  take  place  for  the  overall  activity  to  be  success¬ 
ful.  For  example  the  pattern  of  activity  carried  out  by  an 
individual  or  group  intending  to  threaten  a  military  base  may 
only  be  defined  in  terms  of  the  form  of  the  final  attack. 
Overall  activity  success  is  often  achievable  using  a  large 
number  of  possible  sub-activities  which  could  be  combined 
in a large number of possible chronologies. This means that
many traditional pattern recognition techniques will fail due
to  the  inability  to  accurately  define  a  library  of  target  pat¬ 
terns  to  monitor  for.  Sub-activities  may  have  known  detect¬ 
able  signatures  but  these  are  often  temporally  sparse  and 
with  low  signal  to  noise  ratio.  When  combined  with  low 
duty  cycles  for  sensors  this  means  that  any  single  sensing 
solution  is  likely  to  have  a  low  probability  of  detection. 
Furthermore  the  signatures  of  different  sub-activities  may  be 
spread  across  multiple  transmission  modes  for  which  there  is 
no  single  cross-modal  sensing  technology.  Benign  activity 
being  carried  out  by  other  candidates  often  introduces  con¬ 
founding  and  confusable  signals  which  act  to  mask  the  pres¬ 
ence  of  a  target  activity  and  hence  reduce  the  probability  of 
detection  and  introduce  false  alarms.  These  benign  activi¬ 
ties  are  often  as  ill-defined  as  the  target  activity.  The  com¬ 
pound  effect  of  these  challenges  makes  the  fusion  of  a  set  of 
cross-modal  sensors  essential  to  detecting  such  an  activity. 

The  Decision  Support  Tool  (DST)  presented  in  this  paper 
was  developed  to  support  a  decision  maker  in  the  search  for 
such  an  activity.  The  high-level  user  requirements  ad¬ 
dressed  during  the  design  of  the  DST  included: 

1.  Sensor  processing  algorithms  capable  of  determining 
the  relevance  of  the  data  collected  by  a  set  of  cross- 
modal  sensors  to  possible  target  activities. 

2.  An  information  fusion  centre  capable  of  combining  the 
information  output  from  the  sensor  processing  algo¬ 
rithms  to  determine  the  overall  belief  that  each  candi¬ 
date  is  carrying  out  the  target  activity. 

3.  A  generic  software  framework  implementing  algorithms 
that  are  free  of  inbuilt  hypotheses  and  assumptions  that 
could  reduce  the  detection  sensitivity  of  the  system. 

4.  A  graphical  user  interface  (GUI)  capable  of:  visualising 
current  and  historical  situational  awareness  pictures  and 
sensor  deployments;  and,  interrogating  the  underlying 
sensor  data. 

Hidden  Markov  Models  (HMMs)  are  a  natural  choice  for 
mathematically  describing  situations  where  there  is  a  se¬ 
quence  of  hidden  states  of  the  world  observed  only  through 
noisy  sensor  measurements.  Making  an  assumption  of 
Markovian  structure,  the  possible  target  and  benign  se¬ 
quences  of  sub-activities  that  could  be  being  carried  out  by 
each  candidate  can  be  modelled  as  HMMs.  This  allows  the 
likelihood  that  each  candidate  is  carrying  out  the  target 
activity  to  be  calculated  given  an  incomplete  sequence  of 
noisy  sensor  measurements.  This  process  is  formally  de¬ 
scribed  in  Section  2.1. 

For  some  sensors  there  are  no  signatures  of  sub-activities 
available  that  can  instantaneously  identify  the  activity;  how¬ 
ever  the  time  series  of  some  physical  characteristic  of  a 
candidate  carrying  out  the  target  activity  may  be  expected  to 
appear  anomalous  when  compared  to  the  time  series  of  the 
same physical characteristic for all other (innocent) candidates
over time. This makes the HMM approach unsuitable.
In  these  cases  a  time  series  anomaly  detection  (TSAD)  algo¬ 
rithm  is  used  to  produce  an  anomaly  score  for  each  candi¬ 
date  over  time.  Assuming  the  relationship  between  these 
anomalies  and  the  possible  target  activities  is  known,  this 
anomaly  score  can  be  used  to  calculate  the  likelihood  that 
each  candidate  is  carrying  out  the  target  activity.  This  proc¬ 
ess  is  formally  described  in  Section  2.2. 

Bayesian  probability  provides  a  natural  information  fusion 
framework  when  probabilistic  information  is  available  from 
the  processing  of  a  set  of  cross-modal  sensors.  In  particular 
the  combination  of  the  probabilistic  belief  about  the  true 
activities  carried  out  by  each  candidate  based  upon  the  out¬ 
put  of  HMMs  and  TSAD  algorithms  can  easily  be  stated  in 
terms  of  Bayesian  probability.  Furthermore,  Bayesian  prob¬ 
ability  naturally  allows  the  incorporation  of  prior  knowledge 
about  the  expected  number  of  candidates  carrying  out  the 
target  activity.  The  highly  efficient  Bayesian  algorithm  that 
was  designed  and  implemented  for  the  DST  is  described  in 
Section  2.3. 

The  DST  is  a  concept  demonstrator  which  has  been  de¬ 
veloped  to  prototype  status  using  two  core  COTS  software 
applications customised to add bespoke additional functionality.
The system architecture and GUIs are discussed in Section 3. The
visualisation options available to the decision maker to symbolise
the probabilistic fusion outputs on top of imagery of the search
area are described in Section 3.4. The bespoke data interrogation
and sensor deployment tracking features that were developed are
described in Section 3.3.

In  Section  4  we  summarise  the  performance  of  the  DST 
and  in  Section  5  we  discuss  some  aspirational  future  devel¬ 
opments. 

2  Core  DST  components 

2.1  HMM 

An HMM describes a system which at any time is in one of
N  distinct  hidden  states.  At  regularly  spaced,  discrete  times, 
the  system  undergoes  a  change  of  state  according  to  a  set  of 
state  transition  probabilities.  These  hidden  states  can  only 
be  observed  indirectly  through  noisy  sensor  observations.  In 
the  case  of  a  first  order  HMM,  which  we  consider  here,  the 
state  transition  probabilities  depend  only  on  the  preceding 
state,  not  the  whole  history  of  the  hidden  process.  Given  a 
possibly incomplete sequence of noisy observations the
well-known forward algorithm ([2] pp. 203-206) is used to
calculate  the  likelihood  that  the  sequence  of  observations 
was  produced  by  the  underlying  HMM.  HMMs  provide  a 
natural  framework  in  which  to  describe  the  remote  detection 
of a target activity whose structure is not known with certainty.

More formally, an HMM is a quintuple $(S, X, A, B, \pi)$, where
$\Pi = (A, B, \pi)$ represents the set of model parameters, i.e. the
state transition probability matrix, observation probability matrix
and state prior probability vector. $S$ denotes the set of possible
hidden states and $X$ denotes the set of possible observations. In the
DST $S = \{SP, \neg SP\}$ denotes the hidden states Signal present and
Signal not present, i.e. for a given sensor the activity being carried
out by the candidate is either emitting or not emitting a signal that
can be recognised by the sensor as being a signature of the target
activity. Similarly, $X = \{SD, \neg SD\}$ denotes the observations
Signal detected and Signal not detected. Let $q_t$ and $o_t$ denote the
hidden state and observation at time $t$. All HMMs used in the DST have
the structure shown in Figure 1 where the top four state transition
probabilities are of the form $p(q_t \mid q_{t-1})$ and the bottom four
detection probabilities are of the form $p(o_t \mid q_t)$. Note that
$p(SD \mid SP)$ and $p(SD \mid \neg SP)$ denote the sensor probabilities
of detection and false alarm respectively.


Figure  1:  DST  HMM  structure 

The state transition probability matrix $A$ stores the probability
that hidden state $j$ follows hidden state $i$:

$$A = [a_{ij}], \qquad a_{ij} = p(q_t = s_j \mid q_{t-1} = s_i). \tag{1}$$

Note that the transition probabilities are independent of time (a
time-homogeneous Markov chain is assumed); they will however depend on
the sensor to which the HMM corresponds. The observation matrix $B$
stores the probability of observation $k$ being produced from state $i$
and is again independent of time:

$$B = [b_i(k)], \qquad b_i(k) = p(o_t = x_k \mid q_t = s_i). \tag{2}$$

The state prior probability vector $\pi$ stores the probability that at
time $t = 1$ the hidden state is $s_i$:

$$\pi = [\pi_i], \qquad \pi_i = p(q_1 = s_i). \tag{3}$$
For a sequence of observations $O = o_1, o_2, \ldots, o_T$ the forward
algorithm is used to compute the probability $p(O \mid \Pi)$. This
problem can be viewed as evaluating how well the HMM predicts the given
observation sequence.

The probability of the observation sequence $O$ given a specific state
sequence $Q = q_1, q_2, \ldots, q_T$ is:

$$p(O \mid Q, \Pi) = \prod_{t=1}^{T} p(o_t \mid q_t, \Pi) = b_{q_1}(o_1) \times \cdots \times b_{q_T}(o_T). \tag{4}$$

The probability of the state sequence is:

$$p(Q \mid \Pi) = \pi_{q_1}\, a_{q_1 q_2}\, a_{q_2 q_3} \cdots a_{q_{T-1} q_T}.$$

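So we can calculate the probability of the observation sequence given
$\Pi$ as:

$$p(O \mid \Pi) = \sum_{Q} p(O \mid Q, \Pi)\, p(Q \mid \Pi) = \sum_{q_1, \ldots, q_T} \pi_{q_1} b_{q_1}(o_1)\, a_{q_1 q_2} b_{q_2}(o_2) \cdots a_{q_{T-1} q_T} b_{q_T}(o_T). \tag{5}$$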

Evaluating this probability directly would have computational
complexity that is exponential in the number of time steps; therefore,
an algorithm called the forward algorithm (a dynamic programming
recursion that is also used within the Baum-Welch Expectation
Maximisation (EM) procedure for HMM training; see [4] pp. 124) is used
to evaluate the probability recursively. We define


$$\alpha_t(i) = p(o_1 o_2 \cdots o_t,\; q_t = s_i \mid \Pi), \tag{6}$$

i.e. the probability of the partial observation sequence
$o_1 o_2 \ldots o_t$ and the state $s_i$ at time $t$ given the model.
After initialization $\alpha$ is calculated recursively as a sum over
all states at the previous time step. The sum of all values of $\alpha$
at the final time step will equal the probability of obtaining the
observation sequence given the model. More formally:

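1.  Initialisation: $O$ may contain missing observations (for which we
will use the MATLAB inspired notation NaN); in this case we sum over
all possible observations

$$\alpha_1(i) = \begin{cases} \pi_i\, b_i(o_1) & \text{if } o_1 \neq \text{NaN} \\ \pi_i & \text{otherwise} \end{cases} \tag{7}$$

for $1 \le i \le N$.

2.  Induction:

$$\alpha_{t+1}(j) = \begin{cases} \left[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \right] b_j(o_{t+1}) & \text{if } o_{t+1} \neq \text{NaN} \\ \left[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \right] & \text{otherwise} \end{cases} \tag{8}$$

for $1 \le t \le T-1$, $1 \le j \le N$.

3.  Termination:

$$p(O \mid \Pi) = \sum_{i=1}^{N} \alpha_T(i). \tag{9}$$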

By calculating $\alpha$ as the sum over all states at the previous
time step we reduce the complexity of the calculations involved from
$2T N^T$ to $N^2 T$.

For a sequence of observations $O = o_1, o_2, \ldots, o_T$ from a
specific single sensor input to the DST the forward algorithm is used
to compute the probabilities $p(O \mid \Pi_R)$ and $p(O \mid \Pi_B)$
for two HMMs $\Pi_R$ and $\Pi_B$. The subscripts R and B denote that
the activity being carried out by the observed candidate is the target
activity (referred to as RED) or some other benign/innocent activity
(referred to as BLUE). $\Pi_R$ describes the transition of the hidden
states (sub-activities) of the RED activity detectable by the sensor
and the sensor performance in detecting these hidden states. $\Pi_B$
performs the same role for an activity representing all possible BLUE
activity. This problem can be viewed as evaluating how well each of
the RED and BLUE activity models predicts a given observation
sequence. The output from the HMM processing of an observation
sequence for a single candidate is the likelihood ratio

$$\frac{p(O \mid \Pi_R)}{p(O \mid \Pi_B)}. \tag{10}$$
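The forward recursion of Equations (7) to (9) and the likelihood ratio of Equation (10) can be sketched in a few lines. The DST itself was implemented in MATLAB; the Python below is only an illustrative sketch. The function names, the use of `None` in place of NaN for a missing observation, and the index convention (state 0 = SP, state 1 = not SP; observation 0 = SD, observation 1 = not SD) are assumptions made for this example and are not taken from the DST.

```python
def forward_likelihood(obs, A, B, pi):
    """Return p(O | model) using the forward algorithm.

    obs : list of observation indices; None marks a missing observation
          (standing in for the NaN notation used in the text).
    A   : A[i][j] = p(q_t = s_j | q_{t-1} = s_i)
    B   : B[i][k] = p(o_t = x_k | q_t = s_i)
    pi  : pi[i]   = p(q_1 = s_i)
    """
    n = len(pi)

    def emit(i, o):
        # For a missing observation the sum of b_i(k) over all k is 1.
        return 1.0 if o is None else B[i][o]

    # Initialisation, Equation (7)
    alpha = [pi[i] * emit(i, obs[0]) for i in range(n)]

    # Induction, Equation (8)
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * emit(j, o)
            for j in range(n)
        ]

    # Termination, Equation (9)
    return sum(alpha)


def likelihood_ratio(obs, red_model, blue_model):
    """Equation (10): p(O | RED model) / p(O | BLUE model).

    Each model is a tuple (A, B, pi).
    """
    return forward_likelihood(obs, *red_model) / forward_likelihood(obs, *blue_model)
```

For long observation sequences the values of alpha become very small, so a practical implementation would normally rescale alpha at each step or work with logarithms; that refinement is omitted here.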


2.2  TSAD 

For  some  global  sensor  and  target  activity  combinations 
the  relationship  between  sensor  observations  and  the  target 
or  background  activities  is  poorly  understood.  For  example 
the  instantaneous  trajectory  of  an  individual  intending  to 
attack  a  military  base  may  be  indistinguishable  from  those  of 
the  innocent  surrounding  population;  however,  the  route  and 
movement  track  used  by  the  individual  over  a  period  of  time 
may  appear  anomalous  when  compared  to  the  model  of 
normality formed by monitoring all other individuals. This
means  that  traditional  pattern  recognition  methods  and  the 
HMM  method  described  in  Section  2.1  are  unable  to  calcu¬ 
late  the  likelihood  that  the  sensor  observations  received  are 
due  to  the  presence  of  the  target  activity;  however,  it  may  be 
expected  that  the  patterns  of  sensor  observations  of  candi¬ 
dates  carrying  out  target  activities  will  appear  anomalous 
when  compared  to  the  patterns  of  sensor  observations  of  all 
other  candidates  over  time. 

For  sensors  in  this  category  the  DST  takes  as  input  for 
each  candidate  a  multi-dimensional  time  series  of  independ¬ 
ent  statistics  of  measured  physical  emissions.  For  each  pair 
of  candidates  the  similarity  between  their  time  series  is  cal¬ 
culated  using  the  Dynamic  Time  Warping  (DTW)  algorithm. 
DTW  ([4]  pp.  85)  is  a  time  series  similarity  measure  that  is 
able  to  recognise  two  time  series  as  similar  when  one  is 
merely  a  non-linear  temporal  warping  of  the  other.  This 
property  makes  DTW  a  suitable  measure  of  similarity  for 
detecting  anomalous  patterns  of  activity  when  two  ‘normal’ 
candidates  may  be  following  non-linear  temporal  warpings 
of  the  same  underlying  pattern  of  activity.  Based  on  their 
time  series  similarity  to  all  other  candidates  the  candidates 
are  clustered.  There  are  many  possible  clustering  algorithms 
that  can  be  used  to  cluster  time  series  similarity  measures 
[3];  we  defer  a  complete  description  of  the  algorithm  em¬ 
ployed  in  the  DST  to  a  future  publication.  In  essence,  an 
anomaly  score  can  be  calculated  for  each  candidate  based  on 
how  much  it  is  an  outlier  to  each  of  the  resulting  clusters. 
Using  historical  data  or  expert  knowledge,  likelihood  mod¬ 
els  can  be  constructed  to  calculate  the  likelihoods  that  the 
anomaly  score  obtained  would  be  due  to  a  RED  activity 
being  carried  out  or  a  BLUE  activity  being  carried  out  by 
the candidate. For each candidate, this can be expressed as the
likelihood ratio

$$\frac{p(\text{anomaly score} \mid \text{RED})}{p(\text{anomaly score} \mid \text{BLUE})}. \tag{11}$$

These  likelihood  ratios  constitute  the  TSAD  input  to  the 
central  fusion  algorithm. 
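The pairwise similarity step described above can be illustrated with a textbook dynamic-programming form of DTW. This is a minimal sketch rather than the DST implementation: the function name, the Euclidean local cost between samples and the unconstrained warping path are assumptions, and the clustering and anomaly scoring stages, which the text defers to a future publication, are not shown.

```python
import math


def dtw_distance(x, y):
    """Dynamic Time Warping distance between two multi-dimensional series.

    x, y : sequences of equal-dimension points, e.g. [(0.0, 1.0), (0.5, 1.2)].
    The local cost is the Euclidean distance between points (an assumption
    made for this sketch). A smaller value means the series are more similar.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] = cost of the best warping path aligning x[:i] with y[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(x[i - 1], y[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # x advances, y repeats
                                 D[i][j - 1],      # y advances, x repeats
                                 D[i - 1][j - 1])  # both advance
    return D[n][m]
```

Two series that differ only by a temporal warping, such as one candidate dwelling longer at some stages of the same pattern, receive a distance of zero under this measure, which is the property the text relies on.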


2.3  Bayesian  fusion  algorithm 


Let $\{d_1, \ldots, d_N\}$ denote the set of $N$ candidates under
observation, each of which is persistently in either the state RED (R)
or BLUE (B), meaning that a target activity either is or is not being
carried out by the candidate respectively. Let $\{v_1, \ldots, v_M\}$
denote the set of $M$ observing sensors. At a given moment in time each
sensor $v_i$ processes all observations received for each candidate and
produces an array of continuous variables $z_{ij}$. Each $z_{ij}$ could
also be a vector, e.g. a time series of observations, and what follows
holds directly for vectors. For each sensor $v_i$ and each candidate
$d_j$, $z_{ij}$ is processed by either an HMM or the TSAD algorithm to
produce the likelihood ratio

$$L_{ij} = \frac{p(z_{ij} \mid R)}{p(z_{ij} \mid B)}. \tag{12}$$


Note that $\forall z_{ij}$, $L_{ij} \in [0, \infty]$; $L_{ij} = 1$
indicates no information; $L_{ij} \ll 1$ indicates that candidate $d_j$
is likely to be in state B; and $L_{ij} \gg 1$ indicates that candidate
$d_j$ is likely to be in state R. Furthermore, let

$$L_j = \prod_{i=1}^{M} \frac{p(z_{ij} \mid R)}{p(z_{ij} \mid B)} \tag{13}$$


denote the product of likelihood ratios for candidate $d_j$ produced by
all sensors $v_i$, $i = 1 \ldots M$. The persistent state of all $N$
candidates can be described by a binary vector $\lambda$ with $N$
elements. For example, $\lambda = (1\,0\,0\,1\,0 \ldots)$ indicates that
candidate 1 is R, candidates 2 and 3 are B, candidate 4 is R etc. There
are a total of $2^N$ possible persistent states of the world.
Furthermore, the number of distinct state vectors with exactly $n$
elements in state R is

$$C(N, n) = \frac{N!}{n!\,(N-n)!}. \tag{14}$$


Any distribution may be assumed for the expected number $n$ of
candidates in state R. The prior probability of a particular state of
the world $\lambda$ is given by

$$p(\lambda) = \sum_{n} p(\lambda, n) = \sum_{n} p(\lambda \mid n)\, p(n) \tag{15}$$

where $n$ is the number of candidates in state R and

$$p(\lambda \mid n) = \begin{cases} \text{constant} & \text{if } \lambda \text{ is compatible with } n \\ 0 & \text{otherwise} \end{cases} \tag{16}$$


Since, a priori, all $\lambda$ with $n$ candidates in state R are
indistinguishable, and there are $C(N, n)$ of them we have

$$p(\lambda \mid n) = \begin{cases} \dfrac{1}{C(N, n)} & \text{if } \lambda \text{ is compatible with } n \\ 0 & \text{otherwise} \end{cases} \tag{17}$$


The posterior probability of the true state of the world being a
particular $\lambda$ based upon the data observed by sensor $v_i$ is
given by

$$p(\lambda \mid Z_i) \propto p(Z_i \mid \lambda)\, p(\lambda) \tag{18}$$

where $Z_i$ is the vector of sensor outputs for all candidates from
sensor $i$, i.e. $Z_i = (z_{i1}, z_{i2}, \ldots, z_{iN})$. If we assume
the outputs from sensor $i$ are mutually conditionally independent
between candidates, then

$$p(Z_i \mid \lambda) = \prod_{j=1}^{N} p(z_{ij} \mid \lambda(j)) \tag{19}$$

where $\lambda(j)$ is the $j$-th element of the state vector.
Furthermore, if we assume the outputs from all sensors are mutually
conditionally independent for a given candidate, then

$$p(Z \mid \lambda) = \prod_{i=1}^{M} p(Z_i \mid \lambda) \tag{20}$$

where $Z$ is the vector $Z = (Z_1, Z_2, \ldots, Z_M)$. Substituting this
into Bayes rule and dividing through by
$\prod_i \prod_j p(z_{ij} \mid B)$, we have

$$p(\lambda \mid Z) \propto \left( \prod_{j:\, \lambda(j) = 1} L_j \right) p(\lambda). \tag{21}$$

Note that this is a product of the likelihood ratios of candidates in
state R as specified by $\lambda$. Having calculated
$p(\lambda \mid Z)$ we are in a strong position to obtain many useful
probabilities. For example, the probability that candidate $d_j$ is in
state R (irrespective of the states of the other candidates) is given
by the sum

$$\sum_{\lambda:\, \lambda(j) = 1} p(\lambda \mid Z) \tag{22}$$

where the sum is over all $\lambda$ for which candidate $d_j$ is in
state R.
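To make Equations (15) to (22) concrete, the sketch below enumerates every state vector explicitly for a small number of candidates. It is only an illustration written in Python (the DST was implemented in MATLAB); the function name and the argument layout, with `prior_n[n]` holding p(n) for n = 0..N, are assumptions made for this example.

```python
from itertools import product
from math import comb, prod


def posterior_bruteforce(L, prior_n):
    """Probability that each candidate is RED by explicit enumeration.

    L       : list of fused likelihood ratios L_j, one per candidate (Eq. 13).
    prior_n : prior_n[n] = p(n), the prior probability that exactly n
              candidates are in state R, for n = 0..N.
    """
    N = len(L)
    weights = {}
    for lam in product((0, 1), repeat=N):        # all 2^N state vectors
        n = sum(lam)
        prior = prior_n[n] / comb(N, n)          # Eqs. (15) and (17)
        like = prod(L[j] for j in range(N) if lam[j])  # Eq. (21)
        weights[lam] = like * prior
    total = sum(weights.values())                # Bayes normalising constant
    # Eq. (22): sum the posterior over all state vectors with lambda(j) = 1
    return [
        sum(w for lam, w in weights.items() if lam[j]) / total
        for j in range(N)
    ]
```

As a check by hand, with two candidates, L = [4.0, 0.5] and p(n) = [0.5, 0.4, 0.1], the four unnormalised weights are 0.5, 0.8, 0.1 and 0.2 for the state vectors (0,0), (1,0), (0,1) and (1,1), giving posterior probabilities of 1.0/1.6 = 0.625 and 0.3/1.6 = 0.1875 for candidates 1 and 2.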

For large candidate numbers it is not computationally feasible to
implement this Bayesian framework in a “brute force” manner that
explicitly calculates $p(\lambda \mid Z)$ individually for every
possible $\lambda$. The time complexity of such an algorithm is
$O(N 2^{N-1})$. Therefore a novel and efficient algorithm with time
complexity $O(N^4)$ was developed to greatly reduce the required number
of computations to calculate the probabilities given by Equation (22).
The algorithm is based upon the following proposition:


Proposition 1: Let $A$ be the $N \times N$ matrix whose $ij$-th element
(in the $i$-th row and $j$-th column) is given by

$$A_{ij} = L_i \sum_{r=1}^{i-1} A_{r,\,j-1}, \qquad (j > 1), \tag{23}$$

and let $A_{N\ast}$ denote the last row of $A$. Let $P$ denote the
vector $(P_1, P_2, \ldots, P_N)$ with $P_m$ being the prior probability
that a particular $\lambda$ for which precisely $m$ candidates are in
persistent state R represents the true state of the world. That is

$$P_m = \frac{1}{C(N, m)}\, p(m) \quad \text{from (17)}. \tag{24}$$

Let $P \cdot A_{N\ast}$ denote the scalar product of $P$ and
$A_{N\ast}$. Then:

$$P \cdot A_{N\ast} = \sum_{\lambda:\, \lambda(N) = 1} p(Z \mid \lambda)\, p(\lambda). \tag{25}$$

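Proof: Full proof omitted for brevity; the following is a sketch proof
of this result. We first define some notation. Let $\Sigma_{\Pi(n,j)}$
denote the sum of all $C(n, j)$ possible products of the combinations
of $j$ elements of the set of likelihood ratios
$\{L_1, L_2, \ldots, L_n\}$. For example,

$$\Sigma_{\Pi(3,2)} = L_1 L_2 + L_1 L_3 + L_2 L_3. \tag{26}$$

It can be shown that the $ij$-th element of $A$ is equal to
$L_i \Sigma_{\Pi(i-1,\,j-1)}$, which is clearly the sum of all
$C(i-1, j-1)$ products of the combinations of $j$ elements of
$\{L_1, L_2, \ldots, L_i\}$ that include $L_i$. An inductive proof can
be given that the sum of the $m$-th column of the first $n$ rows of $A$
is precisely the sum of all $C(n, m)$ products of likelihood ratios
corresponding to combinations of $m$ elements of
$\{L_1, L_2, \ldots, L_n\}$. That is

$$\sum_{i=1}^{n} A_{im} = \Sigma_{\Pi(n,m)}. \tag{27}$$

From Equation (21) it can be shown that, after dividing through by
$\prod_i \prod_j p(z_{ij} \mid B)$ as before,

$$p(Z \mid \lambda) = \prod_{j:\, \lambda(j) = 1} L_j. \tag{28}$$

Given Equations (23), (27) and (28), it follows that

$$A_{Nm} = \sum_{\lambda:\, \lambda(N) = 1,\ |\lambda| = m} p(Z \mid \lambda) \tag{29}$$

where $|\lambda|$ denotes the number of candidates in persistent state
R in $\lambda$. The result given in Equation (25) is derived as
follows:

$$\begin{aligned}
\sum_{\lambda:\, \lambda(N) = 1} p(Z \mid \lambda)\, p(\lambda)
&= \sum_{m} \sum_{\lambda:\, \lambda(N) = 1,\ |\lambda| = m} p(Z \mid \lambda)\, p(\lambda \mid m)\, p(m) \\
&= \sum_{m} \sum_{\lambda:\, \lambda(N) = 1,\ |\lambda| = m} p(Z \mid \lambda)\, \frac{1}{C(N, m)}\, p(m) && \text{from (17)} \\
&= \sum_{m} \frac{1}{C(N, m)}\, p(m) \sum_{\lambda:\, \lambda(N) = 1,\ |\lambda| = m} p(Z \mid \lambda) \\
&= \sum_{m} P_m \sum_{\lambda:\, \lambda(N) = 1,\ |\lambda| = m} p(Z \mid \lambda) \\
&= \sum_{m} P_m A_{Nm} && \text{from (29)} \\
&= P \cdot A_{N\ast}.
\end{aligned} \tag{30}$$

□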

The usefulness of Proposition 1 is in noticing that $A$ can be
calculated recursively with time complexity $O(N^3)$. If we cyclically
permute $(L_1, L_2, \ldots, L_N)$ $N-1$ times and calculate $A$ after
each permutation we will have calculated Equation (25) for each
candidate. This allows us to then calculate Equation (22) for each
candidate as the normalization constant for Bayes rule will simply be
the sum of the scalar product for all instances of $A$ and $P_0$, the
prior probability that there are no candidates in persistent state R.
Full details of the proof of this proposition, the time complexity
calculations and the avoidance of overflow issues in the algorithmic
implementation will be given in a forthcoming publication by J. E.
Barker and D. J. Salmond.
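The recursion of Proposition 1 can be sketched as follows. This is an illustrative Python reading of the text, not the authors' implementation (which is deferred to their forthcoming publication). Three points are assumptions made for the sketch: the base case A_{i1} = L_i, which is implied by the proof sketch since the first column holds the one-element products; the normalising constant, taken as P_0 plus the scalar product of P with the column sums of A, which follows from Equation (27); and the function names. The avoidance of overflow mentioned in the text is not addressed.

```python
from math import comb


def build_A(L):
    """Matrix of Equation (23): A[i][j] = L[i] * sum_{r<i} A[r][j-1].

    Indices are zero-based, so column j holds products of j + 1 likelihood
    ratios. The first column A[i][0] = L[i] is an assumed base case.
    """
    N = len(L)
    A = [[0.0] * N for _ in range(N)]
    for i in range(N):
        A[i][0] = L[i]
    for j in range(1, N):
        running = 0.0                 # sum of A[r][j-1] over rows r < i
        for i in range(N):
            A[i][j] = L[i] * running
            running += A[i][j - 1]
    return A


def posterior_red(L, prior_n):
    """Probability that each candidate is RED, using Proposition 1.

    L       : list of fused likelihood ratios L_j (Eq. 13).
    prior_n : prior_n[n] = p(n) for n = 0..N.
    """
    N = len(L)
    P = [prior_n[m] / comb(N, m) for m in range(N + 1)]   # P[0] is P_0, Eq. (24)

    # Normalising constant: P_0 plus sum_m P_m * (sum of all m-element products)
    A = build_A(L)
    norm = P[0] + sum(
        P[m] * sum(A[i][m - 1] for i in range(N)) for m in range(1, N + 1)
    )

    posterior = []
    for j in range(N):
        rotated = L[j + 1:] + L[:j + 1]       # cyclic shift: candidate j is last
        last_row = build_A(rotated)[N - 1]
        numerator = sum(P[m] * last_row[m - 1] for m in range(1, N + 1))  # Eq. (25)
        posterior.append(numerator / norm)
    return posterior
```

Because this sketch carries a running column sum, each matrix is filled in a number of operations quadratic in N; a direct evaluation of the inner sum in Equation (23) would instead give the cubic cost quoted in the text.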

3  System  architecture  and  visualisation 

3.1  System  overview 

The  DST  presented  here  is  a  prototype  concept  demon¬ 
strator  based  on  the  HMM  forward  algorithm,  TSAD  and 
Bayesian  fusion  algorithm  described  in  Section  2.  The  con¬ 
cept  of  use  for  the  DST  is  as  an  aid  to  a  decision  maker 
leading  a  search  for  an  activity,  of  the  type  described  in 
Section  1 ,  possibly  being  carried  out  by  an  unknown  number 
of  individuals  within  a  large  set  of  candidates.  Through  the 
course  of  the  search  the  decision  maker  will  task  local  sen¬ 
sors  to  remotely  observe  small  numbers  of  candidates.  The 
data  collected  by  each  local  sensor  will  be  processed  after 
each  tasking  and  the  sequence  of  Signal  detected  and  Signal 
not  detected  observations  will  be  updated  for  each  candi¬ 
date.  The  decision  maker  may  also  task  global  sensors  to 
remotely  observe  some  physical  emissions  from  all  candi¬ 
dates  over  time;  the  processed  data  from  each  global  sensor 
will  be  used  to  update  a  multi-dimensional  real-valued  time 
series  stored  for  each  candidate.  On  the  presentation  of  new 
or  updated  data  the  DST  processes  the  updated  observation 
sequences  or  time  series  for  each  candidate  using  the  HMM 
forward  algorithm  or  TSAD.  The  updated  likelihood  ratios 
for  all  candidates  are  processed  through  the  Bayesian  fusion 
algorithm  on  a  single  sensor  basis  and  across  the  complete 
set  of  sensors.  The  output  of  this  is  a  set  of  updated  prob¬ 
abilities  that  each  candidate  is  carrying  out  the  target  activity 
based  upon  each  of  the  individual  sensors  as  well  as  the 
fusion  of  all  sensors;  these  probabilities  are  stored  within  a 
central  common  data  store  as  shown  in  Figure  2.  The  HMM 
forward  algorithm,  TSAD,  Bayesian  fusion  algorithm  and 
required data handling and storage functions are implemented in
the COTS software MATLAB from The MathWorks.
The  probabilities  stored  in  the  common  data  store  are  then 
accessed  by  the  ESRI  Geographic  Information  System  (GIS) 
ArcMap  for  visualisation. 

On  initiation  of  the  fusion  system  the  decision  maker  can 
choose to operate the system in one of two separate high-level
components: the Standard Fusion system or the ‘What-If’ Fusion
(WIF) system. The complete process described
above  constitutes  the  Standard  Fusion  system.  The  WIF 
system  was  developed  to  allow  alternative  hypotheses  about 
the  parameters  of  the  target  and  background  activities  and 
sensor  performance  to  be  investigated  during  the  search 
without altering the core Standard Fusion system. Essentially the
Standard Fusion system can be seen as utilising the best possible
estimate of the HMM parameters and Bayesian prior probabilities given
available historical data and expert knowledge of the target activity;
whereas the WIF system allows the decision maker to explore the impact
on the probabilistic outputs of changes to these assumed parameters.
This option is represented by decision node 1 in Figure 2 and is made
through the Fusion GUI shown in Figure 3. The GUI runs in MATLAB but is
activated directly from ArcMap via a Visual Basic for Applications
(VBA) script.

Figure 2: DST workflow diagram




Figure  3:  DST  screenshot  with  GUI 

3.2  Standard  fusion 


The Standard Fusion system has been designed to make no assumptions
about the specific sensors which will be inputting observations to the
system and as such can accommodate any sensor whose data can be
meaningfully processed within the HMM or time series anomaly detection
frameworks described in Section 2. Prior to initiating the search the
decision maker provides a start-up file to the DST containing the
following information:

•  A list of sensors;

•  Each sensor’s processing type (HMM or TSAD);

•  HMM parameters;

•  TSAD parameters including likelihood model specification;

•  Candidate unique identifiers.

This information is used by the DST to automatically associate new
sensor data files with the correct data processing algorithm, either
the HMM or TSAD. This functionality, producing formatted data dependent
on sensor type and the sensor’s associated processing parameters, is
represented by decision node 2 in Figure 2. The data is then processed
by the associated algorithm to produce a likelihood ratio of the form
shown in Equation (12) for each candidate. These single sensor
likelihood ratios are then processed through the Bayesian fusion
algorithm to provide a probability that each candidate is carrying out
the target activity based upon that single sensor’s observations. The
products of all sensor likelihood ratios for each candidate (Equation
(13)) are then processed through the Bayesian fusion algorithm to
provide a probability that each candidate is carrying out the target
activity given all sensor data. All calculated probabilities are output
in a format that can immediately be symbolised by ArcMap when coupled
with the candidate coordinates at the time of observation.
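The role of the start-up file in routing sensor data to the correct processing algorithm can be pictured with a small sketch. The format of the DST start-up file is not given in the text, so everything below is hypothetical: the `SensorSpec` record, the `load_startup` and `route` functions and the dictionary layout are illustrative assumptions written in Python, not the MATLAB implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SensorSpec:
    """Hypothetical record for one sensor entry of a start-up file."""
    name: str
    processing: str                               # "HMM" or "TSAD"
    params: dict = field(default_factory=dict)    # HMM or TSAD parameters


def load_startup(specs, candidate_ids):
    """Build an in-memory start-up description from sensor specs and candidate IDs."""
    sensors = {}
    for spec in specs:
        if spec.processing not in ("HMM", "TSAD"):
            raise ValueError(f"unknown processing type: {spec.processing}")
        sensors[spec.name] = spec
    return {"sensors": sensors, "candidates": list(candidate_ids)}


def route(sensor_name, startup):
    """Return the processing type to apply to a new data file from this sensor."""
    return startup["sensors"][sensor_name].processing
```

Keeping the sensor-to-algorithm association in data rather than in code is one way to achieve the sensor-agnostic behaviour described above: adding a sensor only requires a new entry in the start-up description.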


3.3  What-If  fusion 

In  practice  the  HMM  and  TSAD  parameters  and  Bayesian 
prior  probabilities  would  be  the  best  possible  estimates 
based  on  historical  data  and  expert  knowledge  of  the  target 
activity.  Recognising  that  there  may  be  error  or  uncertainty 
in  the  elicited  probabilities  the  WIF  system  was  developed 
to  allow  alternative  hypotheses  about  the  parameters  of  the 
target  and  background  HMMs  and  sensor  performance  to  be 
investigated  during  the  search  without  altering  the  core 
Standard  Fusion  system.  This  allows  the  decision  maker  to 
explore  the  impact  on  the  probabilistic  outputs  of  changes  to 
the  assumed  parameters.  These  alternative  hypotheses  and 
corresponding  parameters  can  be  generated  during  the 
search  through  a  dedicated  WIF  GUI  and  their  impact  seen 
without  affecting  the  Standard  Fusion  system. 

The  WIF  system  allows  testing  of  hypothesised  scenarios 
that  range  from  the  simple,  such  as  removing  candidates  or 
sensors  from  consideration  by  the  DST,  through  to  the  more 
complex  situation  of  hypothesising  a  specific  target  activity 
and re-processing the sensor data with a subject matter expert (SME) informing
an  alternative  set  of  transition  probabilities  for  the  HMMs. 
The  WIF  system  can  also  allow  sensor  data  to  be  reproc¬ 
essed  using  alternative  algorithms  to  the  HMM  or  TSAD,  so 
long  as  the  algorithms  are  compatible  with  both  the  Bayes¬ 
ian  fusion  framework  and  the  data  provided  by  the  sensors. 
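
The simplest WIF hypotheses (dropping a sensor, changing a prior) can be illustrated with a minimal sketch. It assumes a naive-Bayes style fusion in which sensors are conditionally independent given the hypothesis; this is an assumption made for illustration and is not the fusion algorithm of Section 2. The function names, sensor names and numbers are all hypothetical.

```python
from math import prod

def fuse(prior, likelihood_ratios):
    """Posterior P(target | data) from a prior and per-sensor likelihood ratios.

    Each ratio is p(data | target) / p(data | background) for one sensor.
    Sensors are treated as conditionally independent (assumption of this sketch).
    """
    odds = prior / (1.0 - prior) * prod(likelihood_ratios)
    return odds / (1.0 + odds)

def what_if(prior, ratios_by_sensor, excluded_sensors=(), new_prior=None):
    """Re-fuse under an alternative hypothesis without modifying the inputs."""
    kept = [r for s, r in ratios_by_sensor.items() if s not in excluded_sensors]
    return fuse(prior if new_prior is None else new_prior, kept)

# Hypothetical likelihood ratios for one candidate.
ratios = {"sensor_a": 4.0, "sensor_b": 0.5, "sensor_c": 2.0}
baseline = fuse(0.1, list(ratios.values()))                      # all sensors
without_b = what_if(0.1, ratios, excluded_sensors={"sensor_b"})  # what-if: drop sensor_b
```

Because `what_if` only reads the stored ratios, the baseline result is left untouched, mirroring the separation between the WIF and Standard Fusion systems.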

The WIF GUI allows the decision maker to specify candidate
removal and sensor subset hypotheses. The parameters
required for data processing under these hypotheses are then
specified through a Model Generation GUI. Upon specification
of a sensor subset, all data previously processed through
the Standard Fusion GUI for those sensors is loaded from
the common data store. The pre-formatted data is then
restructured based upon any time constraints, e.g. data
received out of chronological order. The newly structured data
is then passed from the data restructuring process to the
standard data processing described in Section 3.2.
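
The restructuring step can be sketched as follows, assuming each stored observation carries a sensor identifier, a candidate identifier and a timestamp. The record layout and names are hypothetical; the actual format of the common data store is not described here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    sensor_id: str
    candidate_id: str
    timestamp: float   # time of observation (hypothetical unit: seconds)
    value: float

def restructure(observations, sensor_subset):
    """Keep observations from the chosen sensors, ordered in time per candidate."""
    selected = [o for o in observations if o.sensor_id in sensor_subset]
    by_candidate = {}
    # Sorting restores chronological order for data that arrived late.
    for obs in sorted(selected, key=lambda o: o.timestamp):
        by_candidate.setdefault(obs.candidate_id, []).append(obs)
    return by_candidate

# Hypothetical store contents, received out of chronological order.
store = [
    Observation("sensor_a", "c1", 30.0, 0.2),
    Observation("sensor_a", "c1", 10.0, 0.4),
    Observation("sensor_b", "c1", 20.0, 0.9),
    Observation("sensor_a", "c2", 5.0, 0.1),
]
restructured = restructure(store, {"sensor_a"})
```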

3.4  Visualisation 

Visualisation of the probabilistic outputs of the Standard
Fusion and WIF systems is provided by a customised GIS
application developed using the ESRI COTS software
ArcMap. The visualisation application includes the following
functionality:

•  Display background geospatial information, e.g.
aerial imagery, mapping and polygons representing
the candidates;

•  Visualise spatially and temporally referenced
probabilistic output of the Standard Fusion and WIF
systems;

•  Temporal analysis of probabilistic information
using a time analysis extension;

•  Creation and visualisation of geographic
representations of sensor tasking, thus allowing temporal
and spatial analysis of sensor deployments.

The foundation of the visualisation tool is the imported
geospatial information; this provides the background
imagery and underlying coordinate system over which the
probabilistic information can be symbolised. The DST can
be initialised with a single ortho-rectified, geo-referenced
image and set of polygons representing the shape and spatial
distribution of the candidates. The advantage of a GIS
system, however, is in the depth of information which can be
supported; additional information such as the infrastructure
of the areas containing the set of candidates can easily be
ingested into the DST to provide a rich source of
background information.

The free ArcMap extension TimeSlider, from Applied
Science Associates (ASA), was used to allow rapid changing
between current and historical situational awareness
pictures. Using this extension it is quick and easy for the
decision maker to analyse the temporal trends in the spatially
referenced probabilistic information.

The probabilistic information (with associated geospatial
coordinates) is accessed by ArcMap from the common data
store created by the MATLAB fusion systems. The visualisation
tool uses VBA scripts to allow the decision maker to
select and visualise a specific single sensor or fused
probability file. Increasing probability that a candidate is
carrying out the target activity is symbolised by a circular
polygon of increasing radius and also by a colour-map, as
shown in Figure 4. The ability to visualise single sensor as
well as fused probabilistic outputs aids the decision maker in
his search by allowing him to understand the relative
contributions of each sensor's evidence to the fused
probabilistic output. This insight will also suggest possible
‘What-If’ hypotheses to generate.
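
The symbology rule (larger radius and a colour-map for higher probability) can be sketched as a simple mapping. The linear scaling, the radius limits and the green-to-red ramp are assumptions for illustration only; the actual symbology is that shown in Figure 4, and this Python sketch is not the VBA used by the tool.

```python
def symbolise(probability, min_radius=5.0, max_radius=50.0):
    """Map a probability in [0, 1] to a circle radius and an RGB colour.

    Linear interpolation is assumed; radius limits are in arbitrary map units.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    radius = min_radius + probability * (max_radius - min_radius)
    red = round(255 * probability)            # more red as probability rises
    green = round(255 * (1.0 - probability))  # more green as probability falls
    return radius, (red, green, 0)

low = symbolise(0.0)    # smallest, green symbol
high = symbolise(1.0)   # largest, red symbol
```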

Sensor taskings can be captured in the 2D environment
using a semi-automated process where the deployment area
is created by the user as a polygon and details such as
sensor type, deployment time and deployment duration can be
associated with it. This supports the decision maker in his
search by allowing spatio-temporal analysis of sensor
taskings to ensure limited sensing resources are being utilised
optimally.
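
A tasking record of this kind, and a basic spatio-temporal query over it, might look like the sketch below. The field names are hypothetical and the point-in-polygon test is a standard ray-casting routine standing in for whatever the GIS provides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tasking:
    sensor_type: str
    polygon: tuple     # ((x, y), ...) vertices of the deployment area
    start: float       # deployment time
    duration: float    # deployment duration, same unit as start

    def active_at(self, t):
        return self.start <= t <= self.start + self.duration

    def covers(self, x, y):
        """Ray-casting point-in-polygon test."""
        inside = False
        n = len(self.polygon)
        for i in range(n):
            x1, y1 = self.polygon[i]
            x2, y2 = self.polygon[(i + 1) % n]
            if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside

def taskings_on(taskings, x, y, t):
    """Taskings whose area covers (x, y) and which are active at time t."""
    return [k for k in taskings if k.active_at(t) and k.covers(x, y)]

square = ((0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0))
plan = [Tasking("sensor_a", square, start=100.0, duration=50.0)]
hits = taskings_on(plan, 5.0, 5.0, 120.0)
```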

A data interrogation tool known as the ‘Dig Down’ tool
has been developed to allow the decision maker to
quickly display plots of the sensor observations and time
series data received for a particular candidate. This tool
uses a VBA script to command MATLAB to plot data
drawn from the common data store. This aids the decision
maker in his search by presenting the underlying sensor data
in such a way as to indicate the driving factors behind the
probabilistic outputs.
Figure 4: DST probability symbology¹

¹ Candidate polygons shown have been chosen to be generic
and do not refer to any particular type of candidate.
4  Summary

The  concept  demonstrator  DST  presented  in  this  paper 
has  been  tested  through  a  series  of  evaluation  scenarios 
supporting  a  real  decision  maker  but  based  on  synthetic 
data.  The  DST  was  found  to  be  an  effective  and  efficient 
way  to  collate  a  large  number  of  sensor  inputs  over  a  large 
candidate  target  set  and  present  the  decision  maker  with  a 
single  current  and  historical  situational  awareness  picture. 
This  picture  and  the  ability  to  track  its  evolution  over  time 
aided  the  decision  maker  in  prioritising  sensor  deployments 
to  effectively  rule  out  some  candidates  and  increase  the 
focus  on  others.  The  DST  acted  not  only  as  a  record  over 
time  of  the  progress  of  the  search  but  also  as  a  record  of 
sensor  taskings  and  the  volume  of  data  collected  against 
each  candidate.  The  data  interrogation  features  provided  by 
the  single  sensor  view  and  the  ‘Dig  Down’  tool  were  found 
to  be  essential  in  allowing  the  decision  maker  to  understand 
the  relative  contributions  of  different  sensors'  evidence  to  the
overall  belief  attributed  to  each  candidate. 

5  Future  Development 

The current version of the DST is only able to process
sensor observations and display the resulting probabilistic
information at the individual candidate level. In practice,
many sensors will return observations at different
granularities of the candidate set, e.g. specific to
sub-features of the candidates or only specific to a sub-set of all
candidates. The process of associating all observations to
individual candidates causes a loss of information between
the raw sensor observations and the data maintained and
processed by the DST. Research is now underway to
understand how this information can be retained and exploited in
a future version of the DST based on a hierarchical Bayesian
fusion framework.

The current 2D visualisation tool has the limitation that
only a single probability against each individual candidate
polygon can be displayed at a specific time step. In practice
it is often important to the decision maker to be able to
display more specific information about the candidate, such
as its structure or status at the time of observation, as these
may influence the relevance of the data collected to the
search. Currently there is no mechanism for displaying
such information. Options such as transitioning the DST to
a 3D environment such as Esri's ArcScene are being
investigated to provide these capabilities.

To support the large volumes of data that will be
generated by the improved DST visualisation system, a more
intelligent geo-spatially referenced database system will be
required. This database should allow data to be attributed at
multiple levels of granularity as described above. This
database will operate in a client-server environment that will
allow alterations to be made by a number of different
clients through specific applications. Applications will
include: an interactive tool for sensor tasking queries which
will provide information on data received and data
requested; and a sensor deployment tool that will allow
mapping of sensor coverage and line of sight.

The HMM processing within the DST is currently applied
to each sensor observation sequence separately. In practice
this means that the likelihood of obtaining each sensor
observation sequence is not calculated with an assumption
of a single underlying sequence of hidden states which all
sensors are observing. Ignoring this dependency between
sensor observations means that additional constraints on the
possible underlying hidden state sequence may be
overlooked. Work is underway to understand the maximum
amount of information which can be extracted from the
observation sequences across all sensors and the impact that
this will have on the required sensor processing algorithms.
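
The difference between per-sensor and joint processing can be illustrated with the standard HMM forward algorithm. The sketch below assumes discrete observations, time-aligned sequences of equal length and sensors that are conditionally independent given the hidden state; these are assumptions for illustration, not the processing used in the DST or the approach under investigation.

```python
from math import prod

def forward(init, trans, emit_probs):
    """Sequence likelihood; emit_probs[t][i] = P(observation_t | state i)."""
    n = len(init)
    alpha = [init[i] * emit_probs[0][i] for i in range(n)]
    for e in emit_probs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * e[j]
                 for j in range(n)]
    return sum(alpha)

def emissions(emit, seq):
    """emit[i][o] = P(obs o | state i); returns per-step, per-state probabilities."""
    return [[row[o] for row in emit] for o in seq]

def separate_likelihood(init, trans, emits, seqs):
    """Each sensor is explained by its own, unrelated hidden state sequence."""
    return prod(forward(init, trans, emissions(e, s)) for e, s in zip(emits, seqs))

def joint_likelihood(init, trans, emits, seqs):
    """All sensors are explained by one shared hidden state sequence."""
    per_sensor = [emissions(e, s) for e, s in zip(emits, seqs)]
    n = len(init)
    combined = [[prod(rows[k][i] for k in range(len(rows))) for i in range(n)]
                for rows in zip(*per_sensor)]
    return forward(init, trans, combined)

init = [0.5, 0.5]
trans = [[1.0, 0.0], [0.0, 1.0]]   # hidden state never changes
emit = [[1.0, 0.0], [0.0, 1.0]]    # each sensor reveals the state exactly
seqs = [[0, 0], [1, 1]]            # the two sensors contradict each other
separate = separate_likelihood(init, trans, [emit, emit], seqs)
joint = joint_likelihood(init, trans, [emit, emit], seqs)
```

In this deliberately extreme example `separate` is 0.25 while `joint` is 0.0: no single hidden state sequence can explain both sensors, a constraint that per-sensor processing cannot detect.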

The current anomaly detection process makes an assumption
that p(data|RED)=p(data|anomaly score). It was
recognised during development of the DST that this is
unlikely to be true in practice and that the relationship
between anomalousness and RED will be more complicated; for some
sensor and target activity combinations it may in fact be the
case that RED candidates do not appear anomalous at all.
Understanding this relationship depends upon understanding
the background activity taking place across the areas
containing the set of candidates and the phenomenology of
the sensor and target activity combination. For many
applications this information may not be available prior to initiating
the search. To solve this complex problem we propose to
separately track three hypotheses within the DST:
non-target, target and anomalous, where the target set and
anomalous set are possibly intersecting subsets of the set of
all candidates.